Introduction: As NASA implements the nation’s
Vision for Space Exploration [1] to return to the
moon and travel to Mars, new consideration will be
given to the processes governing the design and
operations of manned spaceflight. New objectives
bring new technical challenges, and safety will
drive many of the resulting decisions.

Historical Context: For the Apollo program, 
safety requirements were individual standards for 
individual subsystems or components. 

The Space Shuttle Program (SSP) combined 
these requirements into a uniform standard, “Fail 
Operational, Fail Safe” or FO/FS. If any one failure
occurs, the Shuttle can continue to operate and
complete all flight mission objectives. Following a
second failure, FO/FS specifies that the vehicle is
Safe: able to return home safely, though it may not
be able to complete all mission objectives.
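
The FO/FS rule above can be read as a mapping from the number of failures to the capability the standard guarantees. The following is a minimal sketch of that reading only; the function name and the returned labels are illustrative assumptions, not SSP terminology.

```python
def fo_fs_capability(failures: int) -> str:
    """Capability guaranteed by FO/FS after a given number of failures.

    The labels are hypothetical names chosen for this sketch.
    """
    if failures < 0:
        raise ValueError("failures must be non-negative")
    if failures == 0:
        return "nominal"           # no failure has occurred
    if failures == 1:
        return "fail operational"  # all mission objectives still achievable
    if failures == 2:
        return "fail safe"         # safe return, objectives may be lost
    return "not covered"           # FO/FS makes no statement past two failures
```

For example, `fo_fs_capability(2)` returns `"fail safe"`, matching the statement that the vehicle can return home safely after a second failure.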

The Payload Safety Process instituted Two Fault
Tolerant (2FT) requirements to add a layer of
protection between Crew/Shuttle and Payloads.

The International Space Station (ISS) based its
safety analysis on the Payload standard of 2FT,
motivated by the lack of ground servicing of the ISS
following failures of safety-critical hardware. The
Shuttle has the option to perform an emergency
deorbit in the event of a major failure. There is no
similar contingency option for the ISS.

Safety Requirements Allocation. System
engineering calls for the breakdown of requirements
by function and the allocation of portions of
requirements to separate subsystems. Unfortunately,
safety requirements cannot follow this model. They
generally must be applied intact to all of the
lowest-level specifications.

Achieving 2FT. There are numerous ways to
achieve satisfactory compliance with the
requirement. One could use independent,
triple-redundant must-work systems or independent
three-inhibit must-not-work systems throughout the
vehicle. Unfortunately, such a design standard is
not possible due to restrictions of weight and
volume. Additionally, it is not necessary, as there
are other ways to meet 2FT. Use of Unlike
Redundancies: separate systems working together can
provide an overall higher level of safety. Use of
Other Available Margins: taking advantage of margin
in a different system to meet the 2FT requirement.

Non-Compliance with 2FT. Alternate approaches
are acceptable when necessitated by design.

Competing Must-Work and Must-Not-Work
Functions. Consider rendezvous. No single or double
event may result in collision. Crew-controlled
flight via joystick is inherently vulnerable to crew
error. Instead, training, design of the approach
corridor, and control of approach speed allow the
crew to recognize errors and adjust.

Equivalent Safety. This is used when risk to
Crew/Vehicle is very low with only single failure
tolerance implemented. The emphasis is on other
controlling factors and not just probability numbers.
For example, a single failure tolerant system with a
long time to effect is considered an acceptable
‘equivalent’ risk to a 2FT system because there is
ample time after the failure to avert the undesired
effect using operational controls.

Design for Minimum Risk (DMR): From the 
ISS Safety Requirements [2]: Design for minimum 
risk are areas where hazards are controlled by
specification requirements that specify safety related
properties and characteristics of the design that have 
been baselined by the ISS program requirements 
rather than failure tolerance criteria. The failure tol- 
erance criteria . . . shall only be applied to these de- 
signs as necessary to assure that credible failures that 
may affect the design do not invalidate the safety 
properties of the design. Examples are mechanisms, 
structures, glass, pressure vessels, pressurized lines 
and fittings, functional pyrotechnic devices, material 
compatibility, flammability, etc. 

Safety Analysis Documents. Safety engineering 
analyses are documented in Hazard Reports, Failure 
Mode Effects Analysis (FMEA) and the Critical 
Items List (CIL) [3].

Hazard Reports. Hazard Reports [4] capture the 
risks that do not meet the FO/FS requirement. Each 
cause is assigned a Severity and Likelihood of Oc- 
currence, and corresponding Hazard Classification. 

The Severity level is an assessment of the worst- 
case effects of a hazard for a given cause. By defini- 
tion, Catastrophic severity is a hazard which could 
result in a mishap causing fatal injury to personnel 
and/or loss of one or more major elements of the 
flight vehicle or ground facility. Critical severity is a 
hazard which could result in serious injury to per- 
sonnel and/or damage to flight or ground equipment 
which would cause mission abort or a significant 
program delay. Marginal severity is a hazard which 
could result in a mishap of minor nature inflicting 
first-aid injury to personnel and/or damage to flight 
or ground equipment which can be tolerated without 
abort or repaired without significant program delay. 

The Likelihood of Occurrence assesses the
probability that the worst-case hazard will take
place, with the controls in place. Probable: expected
to happen in the life of the program; Infrequent:
could happen in the life of the program, controls
have significant limitations or uncertainties;
Remote: could happen in the life of the program, but
is not expected, controls have minor limitations or
uncertainties; Improbable: extremely remote
possibility to happen in the life of the program,
there are strong controls in place.

Figure 1: Sample Hazard Report Risk Matrix. Identifies
the Likelihood of Occurrence and Severity of each Cause.

The risk matrix places each cause in a cell of a
grid, with the Likelihood of Occurrence on the
vertical axis and the Severity on the horizontal axis.
For the sample Risk Matrix with three causes (Fig.
1), two causes (A and C) are Remote/Catastrophic; one
cause (B) is Improbable/Catastrophic.
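
As a minimal sketch of how causes can be binned into such a matrix, the snippet below groups the three sample causes by their (Likelihood, Severity) cell. The function name, the data layout, and the ordering of the category lists are assumptions made for illustration; only the category names and the sample assignments come from the text.

```python
from collections import defaultdict

# Category names from the text; the list ordering is an assumption.
LIKELIHOODS = ["Probable", "Infrequent", "Remote", "Improbable"]
SEVERITIES = ["Marginal", "Critical", "Catastrophic"]


def build_risk_matrix(causes):
    """Group cause names by their (likelihood, severity) cell."""
    matrix = defaultdict(list)
    for name, (likelihood, severity) in causes.items():
        if likelihood not in LIKELIHOODS:
            raise ValueError(f"unknown likelihood: {likelihood}")
        if severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {severity}")
        matrix[(likelihood, severity)].append(name)
    return dict(matrix)


# The three sample causes described for Figure 1.
sample_causes = {
    "A": ("Remote", "Catastrophic"),
    "B": ("Improbable", "Catastrophic"),
    "C": ("Remote", "Catastrophic"),
}
sample_matrix = build_risk_matrix(sample_causes)
```

Here `sample_matrix[("Remote", "Catastrophic")]` holds causes A and C, and `sample_matrix[("Improbable", "Catastrophic")]` holds cause B; every other cell is empty.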

Failure Mode Effects Analysis (FMEA) and 
the Critical Items List (CIL). FMEA/CILs identify 
specific potential hardware failures and describe the 
root cause of failures for components and
subsystems [5].

Each FMEA/CIL is assigned a ‘functional
criticality/hardware criticality’ as follows: ‘1/1’
Single failure which could result in loss of life or
vehicle; ‘1R/2’ Redundant hardware item(s) which, if
all failed, could cause loss of life or vehicle. First
failure would result in loss of mission, or the next
failure could cause loss of life or vehicle; ‘1R/3’
Redundant hardware item(s) which, if all failed,
could cause loss of life or vehicle. First failure has
no effect on mission; second failure may result in
loss of mission; ‘2/2’ Single failure which could
result in loss of mission; ‘2R/3’ Redundant hardware
item(s) which, if all failed, could cause loss of
mission. First failure has no effect; ‘3/3’ All
others.

FMEAs are included on the CIL if they are of
criticality ‘1/1’, ‘1R/2’, or, in some cases, ‘1R/3’.
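
The CIL inclusion rule can be sketched as a simple lookup. This is an illustration under stated assumptions: the text does not say which ‘1R/3’ cases qualify, so the hypothetical `special_case` flag stands in for that unspecified condition, and the function name is likewise invented for the sketch.

```python
ALWAYS_ON_CIL = {"1/1", "1R/2"}
SOMETIMES_ON_CIL = {"1R/3"}
KNOWN_CRITICALITIES = ALWAYS_ON_CIL | SOMETIMES_ON_CIL | {"2/2", "2R/3", "3/3"}


def on_cil(criticality: str, special_case: bool = False) -> bool:
    """Return True if an FMEA of this criticality belongs on the CIL.

    `special_case` is a hypothetical flag for the '1R/3' cases that
    qualify; the source text does not define that condition.
    """
    if criticality not in KNOWN_CRITICALITIES:
        raise ValueError(f"unknown criticality: {criticality}")
    if criticality in ALWAYS_ON_CIL:
        return True
    return criticality in SOMETIMES_ON_CIL and special_case
```

With these definitions, `on_cil("1R/2")` is true, `on_cil("1R/3")` is false unless `special_case=True` is passed, and `on_cil("2/2")` is always false.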

Spacecraft Operations Considerations. During
Shuttle Flight Operations, the Mission Management
Team (MMT) assesses on-orbit anomalies. Engineering
teams review relevant FMEA/CILs and Hazard Reports.
The relative positions in the risk matrix are used to
guide actions to protect against the highest risk.
Each anomaly is assigned a position on the Anomaly
Risk Matrix, shown in Figure 2, which tracks the
Fault Tolerance Remaining against the Next Failure
Consequence. After the anomaly, the position in this
matrix frequently moves, either because the mission
circumstances change and the next failure
consequence becomes less significant, or because
more flight data is obtained or further analysis has
been performed.

ISS operations are handled slightly differently.
Following an anomaly, the engineering investigation
team assesses the Magnitude of Potential
Consequences against the Likelihood (probability)
that the corresponding condition or event will
happen. Subsequently, the team presents options and
identifies the associated risk and the corresponding
reliability after implementation.

Conclusion: NASA has continued to develop
processes for design and operations to ensure flight
safety. The Vision for Space Exploration will bring
new technical challenges and necessitate new
approaches to design and operations.